fix(runtime, channels): unify session backend behind one factory#6384
Conversation
The gateway WebSocket handler persists sessions to SqliteSessionBackend (sessions/sessions.db, with gw_-prefixed keys) and the gateway REST API reads from the same place. The session tools (sessions_list, sessions_history, sessions_send) were initialized with the JSONL SessionStore, so they looked at sessions/*.jsonl files the gateway never writes. Result: gateway sessions are invisible to the agent. sessions_list returns "No active sessions found" even when the SQLite store has gw_ entries, and sessions_history returns "No messages found" for any gateway session ID. The agent then hallucinates session IDs because the tool returns empty data. Switch the registration site at crates/zeroclaw-runtime/src/tools/mod.rs to SqliteSessionBackend. Both backends implement the SessionBackend trait with the same new(&Path) signature, so this is a one-line swap plus a comment update. Closes zeroclaw-labs#5769.
The original PR swapped the runtime tools to SqliteSessionBackend without touching the channel orchestrator, which still hardcoded the JSONL SessionStore. That trades one half-blindness for another: sessions written by channels (Telegram, Slack, Discord, ...) become invisible to the agent's sessions_list / sessions_history tools. Introduce zeroclaw_infra::make_session_backend(workspace_dir, &str) that picks JSONL or SQLite based on [channels].session_backend (which the schema already exposes; nothing was reading it). Both call sites (channel orchestrator session_store, runtime tool registration) now go through this factory, so the configured backend is the single source of truth and channel + WS sessions both surface in the same store. The orchestrator's session_store field switches from Option<Arc<SessionStore>> to Option<Arc<dyn SessionBackend>>. The mtime-based hydration sort is replaced with the trait's list_sessions_with_metadata() last_activity field, which both backends already populate, so there is no SessionStore-specific method on the hot path. Closes zeroclaw-labs#5769
Audacity88
left a comment
There was a problem hiding this comment.
Reviewed current PR head eb9d5b6. I checked live PR state, linked issue #5769, overlapping PR #5770, CI status, formal reviews/comments, the full diff, current labels, and the adjacent session backend call sites in runtime, channels, gateway, config, and infra. CI is green, current labels are bug, size: XS, and risk: high, and I did not find prior reviews or inline comments on this PR.
🟢 What looks good — the shared factory is the right direction
This fixes the important limitation in #5770's one-line SQLite swap: channel sessions and runtime session tools should not be hardwired to different stores. Moving the runtime tools and channel orchestrator toward a shared SessionBackend factory is the right shape.
🔴 Blocking — session_backend = "jsonl" still splits gateway sessions from the tools
The PR body says the configured backend becomes the single source of truth and that channel + WS sessions both surface in the same store. That is only true for the default SQLite path.
The gateway still always opens SqliteSessionBackend for /ws/chat persistence, while the runtime tools now open make_session_backend(workspace_dir, config.channels.session_backend). If an operator sets [channels].session_backend = "jsonl", the tools read JSONL, channels read JSONL, but gateway WS sessions are still written to SQLite. That recreates the original #5769 failure for gateway sessions: sessions_list / sessions_history will not see /ws/chat sessions.
Could you either route the gateway through the same backend decision too, or narrow the config contract so jsonl is explicitly channel-only and the runtime tools cannot claim gateway parity under that setting?
🔴 Blocking — existing channel JSONL histories stop hydrating under the default config
On master, the channel orchestrator always opens the JSONL SessionStore when channels.session_persistence is enabled. This PR changes that to make_session_backend(...), whose default is "sqlite". For existing users who never set session_backend, their channel sessions were written as sessions/*.jsonl, but after this change the default channel hydration path reads sessions/sessions.db instead.
I found SqliteSessionBackend::migrate_from_jsonl(...), but this PR does not call it. That means existing channel history can silently disappear from startup hydration and from the session tools after upgrade, unless the operator happens to add session_backend = "jsonl" manually. The PR body's compatibility and rollback notes say no migration is required because each backend reads its own format, but the default behavior changes which format gets read.
Could you add an explicit migration/compatibility path for existing JSONL channel sessions, or preserve the legacy JSONL default for channels until there is a deliberate migration step?
🟡 Warning — explicit JSONL mode loses the old mtime-based hydration ordering
The channel startup hydration used to sort JSONL session keys by file mtime before truncating to MAX_CONVERSATION_SENDERS. This PR replaces that with SessionBackend::list_sessions_with_metadata().last_activity, but SessionStore does not override list_sessions_with_metadata(). It gets the trait default, which sets last_activity = Utc::now() for every key.
For operators who explicitly choose session_backend = "jsonl", startup hydration can now pick an arbitrary subset once there are more than MAX_CONVERSATION_SENDERS persisted sessions. That is smaller than the migration/default issue above, but it is the same compatibility surface: JSONL mode should keep the old mtime ordering or provide real metadata.
🟡 Warning — this should acknowledge the superseded contributor PR
#6384 appears to replace #5770 by @Lior-Dav. It keeps the same linked issue and carries forward the core diagnosis and initial one-line SQLite-tool fix, then expands it into a fuller backend factory. Since #5770 is still open and maintainerCanModify is true, the superseding path should be explicit: either update #5770 directly, or mark #6384 as superseding it and include the appropriate PR-body attribution / co-author handling for any incorporated work.
WareWolf-MoonWall
left a comment
There was a problem hiding this comment.
Review draft — PR #6384
fix(runtime, channels): unify session backend behind one factory
Proposed verdict: --comment
Take-stock
Audacity88 holds active CHANGES_REQUESTED (posted 2h ago on head commit eb9d5b6) with
2 🔴 blocking findings and 2 🟡 warnings. No other reviewer has weighed in. I cannot
approve over an active block. I'm adding my own read of the diff, confirming Audacity88's
findings against local source, and raising two new observations.
What I looked at
- Full diff (orchestrator/mod.rs, zeroclaw-infra/src/lib.rs, zeroclaw-runtime/src/tools/mod.rs)
- Local source:
crates/zeroclaw-gateway/src/lib.rs—run_gatewaysession init path - Local source:
crates/zeroclaw-infra/src/session_backend.rs—list_sessions_with_metadatadefault impl - Local source:
crates/zeroclaw-infra/src/session_sqlite.rs—migrate_from_jsonl - Local source:
crates/zeroclaw-config/src/schema.rs—default_session_backend() - Open PR #5770 by @Lior-Dav (still open, same linked issue #5769)
🟢 [praise] The factory design is right. A single make_session_backend entry point
that both the channel orchestrator and runtime tools route through is the correct fix for
the root problem in #5769: the session gap was a shared-state boundary problem, not a
per-call-site problem, and the fix belongs at the boundary. The unknown-value fallback
with tracing::warn! is good defensive coding — a config typo can never silently disable
persistence, it just falls back to a safe default with an explanatory warning (per
FND-006 §4.1).
Audacity88's active findings — confirmed
🔴 [blocking — confirmed from source] Gateway split under session_backend = "jsonl"
zeroclaw-gateway/src/lib.rs at the run_gateway session-init path (around L741)
unconditionally opens SqliteSessionBackend::new(&config.workspace_dir) whenever
config.gateway.session_persistence is true. make_session_backend is not wired into
the gateway. So for any operator who sets [channels].session_backend = "jsonl", the
runtime tools and channel orchestrator will read JSONL, but the gateway /ws/chat path
continues to write SQLite. sessions_list / sessions_history will not surface gateway
sessions — the original #5769 failure mode, now on a different backend pairing.
Audacity88's suggested resolution paths are both viable: route the gateway through the
same factory (requires threading the backend decision into run_gateway, whether via
channels.session_backend or a dedicated gateway.session_backend field), or explicitly
narrow the contract so jsonl is documented as channel-only and the PR body removes the
gateway-parity claim for that mode.
🔴 [blocking — confirmed from source] Existing JSONL sessions go dark on upgrade
config/schema.rs defines default_session_backend() returning "sqlite", and
ChannelsConfig::default() sets session_backend: default_session_backend(). Any
deployment without an explicit [channels] session_backend = "jsonl" in their config
will have config.channels.session_backend == "sqlite" — which was always the case,
but before this PR the channel orchestrator ignored that field and always opened
SessionStore (JSONL). This PR makes the field load-bearing: the orchestrator now calls
make_session_backend(..., "sqlite") for the default config. Existing JSONL session
files in sessions/*.jsonl become invisible to startup hydration.
SqliteSessionBackend::migrate_from_jsonl exists and is tested in
session_sqlite.rs — but this PR does not call it. The PR compatibility note
("Backward compatible? Yes for the default config") describes a new-deployment guarantee,
not an upgrade guarantee. For upgrading users on the default config, this is a silent
data-visibility regression.
Resolution options: (a) call migrate_from_jsonl during make_session_backend (or in
start_channels after backend init) when .jsonl files are present in the workspace;
(b) preserve the JSONL default for channels until a deliberate, announced migration step
ships; or (c) make migrate_from_jsonl a documented manual step with a startup warning
when JSONL files are detected in a SQLite-configured workspace.
Audacity88's warnings also hold
🟡 [warning — confirmed] JSONL hydration ordering regresses under explicit jsonl mode
The list_sessions_with_metadata default implementation in session_backend.rs sets
last_activity: Utc::now() for every session key (because SessionStore does not
override this method and the trait default calls self.load(&key) and then timestamps
with Utc::now()). The new hydration sort — sort_by_key(|m| Reverse(m.last_activity))
— therefore produces an arbitrary order for JSONL deployments once more than
MAX_CONVERSATION_SENDERS sessions are persisted. The previous mtime-based ordering
used store.session_mtime(&k), which reads the actual on-disk file modification time.
SessionStore already exposes session_mtime as a concrete method. Implementing
list_sessions_with_metadata on SessionStore using that method would fix the
regression while keeping the trait abstraction intact.
🟡 [warning — confirmed] Supersede attribution for #5770
PR #5770 by @Lior-Dav is still open against the same issue (#5769) and carries the
initial diagnosis and one-line SQLite tool-registration fix that this PR expands into a
full factory. Per the project workflow (AGENTS.md, "Supersedes #" rules), the handling
should be explicit: either mark this PR Supersedes #5770 with Co-authored-by if
code or design was carried forward, or coordinate with @Lior-Dav to update or close
#5770 before this lands. The current state leaves both PRs open on the same issue with
overlapping scope, which creates a merge-ordering ambiguity.
New findings
🟡 [warning — new] make_session_backend has no tests
This is now the single load-bearing factory for both the channel orchestrator and the
runtime tool registration. The three match arms ("jsonl", "sqlite", unknown fallback)
each affect different user configurations, and the fallback behavior (unknown value →
SQLite + warning rather than error) is a deliberate user-visible contract. Unit tests for
the factory — using tempdir as the existing backend tests do — would be appropriate,
especially given that the default behavior matters for upgrade compatibility. A test that
exercises each arm and verifies the returned backend type (or at least that it
successfully loads/saves a session) would close the gap.
🔵 [note] Compatibility claim in the PR body
The "Backward compatible? Yes for the default config (session_backend = 'sqlite')"
note will be read by reviewers as "upgrading users are safe." It means the opposite for
any upgrading user on the default config who has existing JSONL channel sessions.
Once the migration/blocking issue above is resolved, updating this note to reflect the
actual upgrade contract ("yes for new deployments or explicitly-SQLite configs; upgrade
path for existing JSONL sessions requires X") would make the PR body accurate for the
release notes and changelog.
The factory shape is right and the architectural direction is correct per FND-001 §4.1.
Looking forward to the follow-up on the two blocking issues.
|
@JordanTheJet — milestone alignment needed: this PR does not clearly fit within the scope boundary of any open milestone. Please advise on placement or deferral. |
…time sort Per @Audacity88 / @WareWolf-MoonWall review: 1. Gateway `/ws/chat` session persistence now routes through `make_session_backend(workspace_dir, &config.channels.session_backend)` instead of unconditionally opening `SqliteSessionBackend`. With `session_backend = "jsonl"` the gateway, channels, and tools all read the same store; the previous gateway-only SQLite hardcode would have recreated the original session-split bug under that config. 2. `make_session_backend` opens SQLite via a new `open_sqlite_with_jsonl_import` helper that calls `SqliteSessionBackend::migrate_from_jsonl` on first open. Legacy `sessions/*.jsonl` files get imported and renamed to `*.jsonl.migrated`, so an upgrading deployment on the default config doesn't lose channel history. Migration errors log + skip rather than blocking startup. 3. `SessionStore` now overrides `list_sessions_with_metadata` to fill `last_activity` from `session_mtime` instead of inheriting the trait default (which stamped every key with `Utc::now()`). The orchestrator's hydration sort now picks the most-recent JSONL sessions deterministically when truncating to MAX_CONVERSATION_SENDERS. 4. New tests cover all three `make_session_backend` arms (jsonl, sqlite, unknown→sqlite fallback) plus the JSONL→SQLite import flow with the `*.jsonl.migrated` rollback file.
theonlyhennygod
left a comment
There was a problem hiding this comment.
First review on eb9d5b657e9337d86a221c7d8037fb384f064369. No prior formal reviews, no inline threads, no top-level comments. Closes #5769. Read the diff against master, the SessionBackend trait + both backends, the schema field, and the orphan-rollback test path.
The bug diagnosis is exact. The factory shape is the right answer.
What's working well 🟢
- The framing in the body — "Naively swapping the tool registration to SQLite would have traded one half-blindness for another" — is the right diagnosis. The factory exists because the bug isn't "wrong default chosen," it's "two call sites assuming different backends." The fix has to live above both, not at either one.
- Type-erasing the orchestrator's
session_storefromOption<Arc<SessionStore>>toOption<Arc<dyn SessionBackend>>is what makes the rest of the change actually unify anything. Without it the factory would have been window-dressing. SessionMetadata::last_activitywas already on the trait and SQLite was already populating it — removing the JSONL-onlysession_mtime()side call from the hot path and going through the trait is the right shape.- Validation evidence cites the exact test that exercises the new shape (
rollback_orphan_user_turn_also_removes_from_session_store). I appreciate the literal command list.
🟡 [warning]
1. JSONL hydration order is now arbitrary
The new sort:
let mut metadata = store.list_sessions_with_metadata();
metadata.sort_by_key(|m| std::cmp::Reverse(m.last_activity));depends on each backend reporting an actual last_activity. SQLite does — SqliteSessionBackend overrides list_sessions_with_metadata (crates/zeroclaw-infra/src/session_sqlite.rs:303-339) and reads it from session_metadata.last_activity.
JSONL doesn't. SessionStore at crates/zeroclaw-infra/src/session_store.rs:157-185 implements only the required trait methods and inherits the default list_sessions_with_metadata from crates/zeroclaw-infra/src/session_backend.rs:65-80, which constructs last_activity: Utc::now() for every key. After this PR, JSONL hydration:
- Loads metadata where every entry has the same
last_activity(the moment of the call). - Sorts by reverse-
last_activity, which is stable insort_by_keybut the key is identical — so ordering falls back to insertion order fromread_dir(filesystem-defined, non-deterministic). - Truncates to
MAX_CONVERSATION_SENDERS, dropping a non-deterministic subset.
Master's previous code sorted by file mtime, which is at least deterministic and roughly approximates "most recently active." The PR's stated compatibility line — "Operators who set session_backend = "jsonl" continue to read/write JSONL files" — is true for read/write but not for the truncation choice during hydration.
Two clean options:
a. Override list_sessions_with_metadata on SessionStore to populate last_activity from file mtime:
fn list_sessions_with_metadata(&self) -> Vec<SessionMetadata> {
self.list_sessions()
.into_iter()
.map(|key| {
let last_activity = self.session_mtime(&key)
.and_then(|t| t.duration_since(std::time::UNIX_EPOCH).ok())
.and_then(|d| DateTime::<Utc>::from_timestamp(d.as_secs() as i64, 0))
.unwrap_or_else(Utc::now);
// ... message_count etc.
})
.collect()
}Keeps session_mtime load-bearing, JSONL hydration stays deterministic.
b. Delete session_mtime and document that JSONL backend has unspecified hydration order under MAX_CONVERSATION_SENDERS truncation. Reasonable because JSONL is legacy/secondary, but it should be called out so an operator who was relying on mtime ordering knows it's gone.
Either works. Currently the diff implies (a) is true (hydration sort is the same shape) but actually ships (b)'s behavior on JSONL.
2. session_mtime is dead code after this lands
crates/zeroclaw-infra/src/session_store.rs:134 is pub fn session_mtime(...). After this PR, the only caller (orchestrator/mod.rs:5801) goes away. pub fn with no callers will trigger dead_code once warnings are treated as errors elsewhere in the workspace. If you take option (1a) above, this becomes load-bearing again. If you take (1b), delete it in the same commit so the cleanup is atomic.
🔵 [suggestion]
3. Factory has no tests
make_session_backend has three meaningful branches:
"jsonl"returns aSessionStore-backedArc<dyn SessionBackend>"sqlite"returns aSqliteSessionBackend-backed one- anything else logs a warning and falls back to SQLite
A 15-line #[cfg(test)] block covering all three (plus assertion that the fallback case actually constructs a working backend, not just returns Ok) would lock the contract. Right now if someone changes the fallback default to JSONL the schema-mismatch wouldn't fail any test.
4. Fallback-on-unknown is a policy choice worth surfacing
The doc comment says "Unknown values fall back to SQLite with a warning so a typo in config never silently disables persistence." That's defensible — silent disable is the worst failure mode. But the alternative is rejecting at config validation time so the operator sees session_backend = "sqlte" flagged on startup before it falls back to a different backend than they configured. The current shape lets a typo silently change which on-disk format the operator's data lands in.
If config validation is supposed to be strict, this should bail! at startup with a list of valid values. If silent fallback-with-warning is the intended behavior, the doc comment is fine as-is — just worth being explicit that this was a chosen policy, not an accident. Consider adding a sentence like "Treat as policy: typos silently fall through rather than failing startup, on the principle that disabled persistence is worse than a wrong-but-working backend."
Summary
JSONL hydration regression in (1) is the only thing I'd want addressed before merge — everything else is hygiene. The bug fix itself is clean, the architectural shape is right, and the test coverage that already exists exercises the orphan-rollback path through the trait. Once JSONL ordering is either fixed or explicitly accepted as deferred, this is straightforward to approve.
Milestone alignment: v0.7.5 fits cleanly per the milestone's "release / web polishing / runtime stability" scope.
…htly rustfmt CI's Lint job runs nightly rustfmt; my local stable rustfmt 1.8.0 left the long assert!/assert_eq! lines and a use-ordering nit untouched, so the previous push failed the format check on the merge. cargo +nightly fmt --all is clean now.
WareWolf-MoonWall
left a comment
There was a problem hiding this comment.
Re-review — #6384 fix(runtime, channels): unify session backend behind one factory
Verdict: --approve
Reviewer: @WareWolf-MoonWall
Head reviewed: aca9d87 (current HEAD; checks passing)
Prior review head: eb9d5b6 (COMMENTED)
Take-stock
@Audacity88's CHANGES_REQUESTED (at eb9d5b6) is DISMISSED. My prior COMMENTED review is not a block. theonlyhennygod's COMMENTED review raised the same JSONL ordering regression I identified — the diff resolves it. No active CHANGES_REQUESTED. Reviewing the fixed implementation fresh.
Prior findings — resolution status
✅ [resolved] 🔴 Gateway (run_gateway ~L741) hardcodes SqliteSessionBackend::new()
The run_gateway session-init path now calls:
zeroclaw_infra::make_session_backend(&config.workspace_dir, &config.channels.session_backend)
SqliteSessionBackend is no longer directly imported in zeroclaw-gateway/src/lib.rs. The gateway now honours the same [channels].session_backend config field as the channel orchestrator and runtime tools. Setting session_backend = "jsonl" no longer creates the original #5769 failure mode on a different backend pairing.
✅ [resolved] 🔴 Existing JSONL sessions go dark on upgrade
open_sqlite_with_jsonl_import calls backend.migrate_from_jsonl(workspace_dir) immediately after opening the SQLite backend. Sessions are imported, each .jsonl file is renamed to .jsonl.migrated (rollback-recoverable), and migration errors are logged and skipped — the backend still opens. The startup is never blocked on a best-effort migration. The compatibility section in the PR body now accurately describes the upgrade path for default-config deployments.
✅ [resolved] 🟡 JSONL list_sessions_with_metadata sort regresses
SessionStore now overrides list_sessions_with_metadata using session_mtime. The implementation:
fn list_sessions_with_metadata(&self) -> Vec<SessionMetadata> {
self.list_sessions().into_iter().map(|key| {
let last_activity = self.session_mtime(&key)
.map(DateTime::<Utc>::from)
.unwrap_or_else(Utc::now);
SessionMetadata { last_activity, key, ... }
}).collect()
}JSONL hydration now sorts by actual file mtime rather than the Utc::now() fallback from the trait default. The pre-PR ordering semantics are preserved. session_mtime remains a live, load-bearing method — no dead code concern.
✅ [resolved] 🟡 Supersede attribution for #5770
The PR body includes Supersedes #5770 and the commit carries Co-authored-by: Lior-Dav. @Lior-Dav's diagnosis and the initial one-line tool-registration fix that this PR expands are credited on the record. No overlapping open PRs on the same issue.
✅ [resolved] 🟡 make_session_backend has no tests
Four tests now cover all three factory arms plus the migration path:
make_session_backend_jsonl_round_trips_through_session_storemake_session_backend_sqlite_round_trips_through_sqlite_dbmake_session_backend_unknown_value_falls_back_to_sqlitemake_session_backend_sqlite_imports_legacy_jsonl_on_first_open
The fallback-on-unknown behavior (SQLite + warning rather than startup error) is locked by the third test. The migration path is exercised end-to-end in the fourth, including the .jsonl.migrated rename assertion.
✅ [resolved] 🔵 Compatibility claim "Yes for the default config" misleads upgrading users
The compatibility section now clearly distinguishes new deployments, upgrading JSONL deployments, and explicit jsonl operators. The upgrade path is documented accurately.
What's working well
🟢 Factory architecture is the right fix for the root cause
The session gap in #5769 was a shared-state boundary problem — two call sites assuming different backends — not a per-call-site wrong default. make_session_backend solves it at the boundary. A future call site (a new gateway mode, an additional tool) picks up the correct backend automatically just by calling the factory. The fix has the right shape.
🟢 Error handling on migration follows FND-006 §4.1
migrate_from_jsonl errors are logged via tracing::warn! and skipped. The backend still opens. This is the correct failure mode: blocking startup on a best-effort migration would be worse than a partial migration. The log message includes the recovery instruction (switch to session_backend = "jsonl" if you need them visible immediately), which is what FND-006 §4.1 asks for — errors explain the path forward.
🟢 Type erasure in the orchestrator is load-bearing
Changing session_store from Option<Arc<SessionStore>> to Option<Arc<dyn SessionBackend>> is what makes the factory actually unify anything. Without it, make_session_backend would have been window dressing. The type change is what forces the fix to hold across future refactors.
Summary
All prior findings from my COMMENTED review at eb9d5b6 are resolved. The factory is correctly wired across the channel orchestrator, runtime tools, and gateway. JSONL hydration ordering is preserved. Migration runs automatically on first open. Tests cover all contract-relevant branches. CI is green.
Approving.
…ory (#6384) * fix(runtime/tools): use SQLite backend for session tools The gateway WebSocket handler persists sessions to SqliteSessionBackend (sessions/sessions.db, with gw_-prefixed keys) and the gateway REST API reads from the same place. The session tools (sessions_list, sessions_history, sessions_send) were initialized with the JSONL SessionStore, so they looked at sessions/*.jsonl files the gateway never writes. Result: gateway sessions are invisible to the agent. sessions_list returns "No active sessions found" even when the SQLite store has gw_ entries, and sessions_history returns "No messages found" for any gateway session ID. The agent then hallucinates session IDs because the tool returns empty data. Switch the registration site at crates/zeroclaw-runtime/src/tools/mod.rs to SqliteSessionBackend. Both backends implement the SessionBackend trait with the same new(&Path) signature, so this is a one-line swap plus a comment update. Closes #5769. * fix(runtime, channels): route session backend through one factory The original PR swapped the runtime tools to SqliteSessionBackend without touching the channel orchestrator, which still hardcoded the JSONL SessionStore. That trades one half-blindness for another: sessions written by channels (Telegram, Slack, Discord, ...) become invisible to the agent's sessions_list / sessions_history tools. Introduce zeroclaw_infra::make_session_backend(workspace_dir, &str) that picks JSONL or SQLite based on [channels].session_backend (which the schema already exposes; nothing was reading it). Both call sites (channel orchestrator session_store, runtime tool registration) now go through this factory, so the configured backend is the single source of truth and channel + WS sessions both surface in the same store. The orchestrator's session_store field switches from Option<Arc<SessionStore>> to Option<Arc<dyn SessionBackend>>. The mtime-based hydration sort is replaced with the trait's list_sessions_with_metadata() last_activity field, which both backends already populate, so there is no SessionStore-specific method on the hot path. Closes #5769 * fix(sessions): wire gateway through factory, migrate JSONL, restore mtime sort Per @Audacity88 / @WareWolf-MoonWall review: 1. Gateway `/ws/chat` session persistence now routes through `make_session_backend(workspace_dir, &config.channels.session_backend)` instead of unconditionally opening `SqliteSessionBackend`. With `session_backend = "jsonl"` the gateway, channels, and tools all read the same store; the previous gateway-only SQLite hardcode would have recreated the original session-split bug under that config. 2. `make_session_backend` opens SQLite via a new `open_sqlite_with_jsonl_import` helper that calls `SqliteSessionBackend::migrate_from_jsonl` on first open. Legacy `sessions/*.jsonl` files get imported and renamed to `*.jsonl.migrated`, so an upgrading deployment on the default config doesn't lose channel history. Migration errors log + skip rather than blocking startup. 3. `SessionStore` now overrides `list_sessions_with_metadata` to fill `last_activity` from `session_mtime` instead of inheriting the trait default (which stamped every key with `Utc::now()`). The orchestrator's hydration sort now picks the most-recent JSONL sessions deterministically when truncating to MAX_CONVERSATION_SENDERS. 4. New tests cover all three `make_session_backend` arms (jsonl, sqlite, unknown→sqlite fallback) plus the JSONL→SQLite import flow with the `*.jsonl.migrated` rollback file. * style(infra): rewrap make_session_backend test asserts to satisfy nightly rustfmt CI's Lint job runs nightly rustfmt; my local stable rustfmt 1.8.0 left the long assert!/assert_eq! lines and a use-ordering nit untouched, so the previous push failed the format check on the merge. cargo +nightly fmt --all is clean now. c6a0ed7
…ory (zeroclaw-labs#6384) * fix(runtime/tools): use SQLite backend for session tools The gateway WebSocket handler persists sessions to SqliteSessionBackend (sessions/sessions.db, with gw_-prefixed keys) and the gateway REST API reads from the same place. The session tools (sessions_list, sessions_history, sessions_send) were initialized with the JSONL SessionStore, so they looked at sessions/*.jsonl files the gateway never writes. Result: gateway sessions are invisible to the agent. sessions_list returns "No active sessions found" even when the SQLite store has gw_ entries, and sessions_history returns "No messages found" for any gateway session ID. The agent then hallucinates session IDs because the tool returns empty data. Switch the registration site at crates/zeroclaw-runtime/src/tools/mod.rs to SqliteSessionBackend. Both backends implement the SessionBackend trait with the same new(&Path) signature, so this is a one-line swap plus a comment update. Closes zeroclaw-labs#5769. * fix(runtime, channels): route session backend through one factory The original PR swapped the runtime tools to SqliteSessionBackend without touching the channel orchestrator, which still hardcoded the JSONL SessionStore. That trades one half-blindness for another: sessions written by channels (Telegram, Slack, Discord, ...) become invisible to the agent's sessions_list / sessions_history tools. Introduce zeroclaw_infra::make_session_backend(workspace_dir, &str) that picks JSONL or SQLite based on [channels].session_backend (which the schema already exposes; nothing was reading it). Both call sites (channel orchestrator session_store, runtime tool registration) now go through this factory, so the configured backend is the single source of truth and channel + WS sessions both surface in the same store. The orchestrator's session_store field switches from Option<Arc<SessionStore>> to Option<Arc<dyn SessionBackend>>. The mtime-based hydration sort is replaced with the trait's list_sessions_with_metadata() last_activity field, which both backends already populate, so there is no SessionStore-specific method on the hot path. Closes zeroclaw-labs#5769 * fix(sessions): wire gateway through factory, migrate JSONL, restore mtime sort Per @Audacity88 / @WareWolf-MoonWall review: 1. Gateway `/ws/chat` session persistence now routes through `make_session_backend(workspace_dir, &config.channels.session_backend)` instead of unconditionally opening `SqliteSessionBackend`. With `session_backend = "jsonl"` the gateway, channels, and tools all read the same store; the previous gateway-only SQLite hardcode would have recreated the original session-split bug under that config. 2. `make_session_backend` opens SQLite via a new `open_sqlite_with_jsonl_import` helper that calls `SqliteSessionBackend::migrate_from_jsonl` on first open. Legacy `sessions/*.jsonl` files get imported and renamed to `*.jsonl.migrated`, so an upgrading deployment on the default config doesn't lose channel history. Migration errors log + skip rather than blocking startup. 3. `SessionStore` now overrides `list_sessions_with_metadata` to fill `last_activity` from `session_mtime` instead of inheriting the trait default (which stamped every key with `Utc::now()`). The orchestrator's hydration sort now picks the most-recent JSONL sessions deterministically when truncating to MAX_CONVERSATION_SENDERS. 4. New tests cover all three `make_session_backend` arms (jsonl, sqlite, unknown→sqlite fallback) plus the JSONL→SQLite import flow with the `*.jsonl.migrated` rollback file. * style(infra): rewrap make_session_backend test asserts to satisfy nightly rustfmt CI's Lint job runs nightly rustfmt; my local stable rustfmt 1.8.0 left the long assert!/assert_eq! lines and a use-ordering nit untouched, so the previous push failed the format check on the merge. cargo +nightly fmt --all is clean now. c6a0ed7
Summary
The session tools (
sessions_list,sessions_history,sessions_send,sessions_current) were registered against the JSONLSessionStore, but the gateway WebSocket handler persisted sessions toSqliteSessionBackend. Any session created via/ws/chatwas therefore invisible to the agent's own session tools (sessions_listreturned "No active sessions found" for valid gateway session IDs, and the agent then hallucinated session IDs).Naively swapping the tool registration to SQLite would have traded one half-blindness for another: the channel orchestrator was opening JSONL too, so channel sessions (Telegram, Slack, Discord, ...) would have stopped showing up in the tools.
Introduce
zeroclaw_infra::make_session_backend(workspace_dir, &str)that picks JSONL or SQLite based on the existing[channels].session_backendconfig field (which the schema already exposed; nothing was reading it). All three call sites — the channel orchestrator, the runtime tool registration, and the gateway/ws/chatsession store — now go through this factory. The configured backend becomes the single source of truth across channel, gateway, and tool reads.The orchestrator's
session_storefield switches fromOption<Arc<SessionStore>>toOption<Arc<dyn SessionBackend>>. The mtime-based hydration sort is replaced with the trait'slist_sessions_with_metadata().last_activityfield;SessionStoreoverrides the trait default to filllast_activityfromsession_mtimeso JSONL hydration ordering is preserved.When the SQLite backend opens for the first time and legacy
sessions/*.jsonlfiles exist on disk,make_session_backendcallsSqliteSessionBackend::migrate_from_jsonlto import them and renames each source file to*.jsonl.migratedso a rollback is recoverable. Migration errors are logged and skipped; the backend still opens.Closes #5769. Supersedes #5770 — the original one-line tool-registration fix is folded into the broader factory in this PR.
Validation Evidence
Security & Privacy Impact
Compatibility
session_backend = "sqlite"(the default) opens the SQLite store; channel, gateway, and tool reads share it.make_session_backendimportssessions/*.jsonlintosessions/sessions.dband renames each source file to*.jsonl.migrated. Sessions remain visible to the channel orchestrator and to the new tools after upgrade.*.jsonl.migratedfiles stay on disk so an operator who needs to roll back can rename them back.session_backend = "jsonl": unchanged behaviour. JSONL files keep their existing read/write semantics; the gateway WS path now also reads/writes JSONL, closing the original Session tools (sessions_list/sessions_history) use JSONL backend instead of SQLite — returns empty for gateway sessions #5769 split for that backend choice.[channels].session_backendis now load-bearing for all three call sites — channels, gateway, and runtime tools — instead of being mostly ignored.Rollback
git revert <merge-sha>returns to the pre-PR shape. Imported sessions remain readable in the SQLite db; the renamed*.jsonl.migratedfiles can be renamed back to*.jsonlif the operator wants to fall back to the JSONL backend.Co-authored-by: Lior-Dav Lior-Dav@users.noreply.github.com